For an introduction to linear mixed models see https://stats.idre.ucla.edu/other/mult-pkg/introduction-to-linear-mixed-models/.
See also the Wikipedia entry (https://en.wikipedia.org/wiki/Mixed_model).
Another reference is the book on mixed models by McCulloch and Searle (2001/2008).
The Higgins reference is: “A re-evaluation of random-effects meta-analysis”. J. R. Statist. Soc. A (2009) 172, Part 1, pp. 137–159.
Some useful background for the following topic is: (doi: 10.2134/agronj2019.02.0135).
This first example is for soybean row spacing. The question is: is there a difference between growing soybean using rows 15 inches apart vs. 30 inches apart? The planting density is assumed to be the same, so only the configuration of the planting scheme was changed. The experimental design is that of paired strips, organized in different trials located on different farms. Each farm/field/trial has unique soils and management history and experiences its own weather conditions. They are all located in Iowa, so each season will exert similar effects (e.g. dry vs. wet). It is common to analyze the data as the response ratio, the yield of the treatment over the yield of the control, and taking the log is common practice.
\[ RR = \frac{Yield_T}{Yield_C} \]
\[ lrr = \log(RR) \]
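As a toy illustration of these two quantities (the yields below are made up, not from the soybean dataset):

```r
## Made-up paired yields for three strips
yield.trt <- c(62, 58, 65)   # treatment (e.g. narrow rows)
yield.ctl <- c(60, 59, 61)   # control (e.g. wide rows)
rr  <- yield.trt/yield.ctl   # response ratio
lrr <- log(rr)               # log response ratio
round(lrr, 3)  # positive favors the treatment; zero means no difference
```

The log scale makes the ratio symmetric: a doubling and a halving are equal distances from zero.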
## Read the data and select only a subset of columns
soy <- read.csv("./ch1/srs.csv")[, c(1, 4, 7, 9)]
soy$lrr <- log(soy$TRT1_Yld/soy$TRT2_Yld)
soy.o <- soy[order(soy$lrr),]
soy.o$id <- 1:nrow(soy.o)
Visually, it is hard to make a definite statement, but it would seem that the practice is highly beneficial. We could calculate how many observations were positive and see that around 60% of the observations have a positive outcome. Is this enough to recommend the use of the practice? No; an economic analysis would be a better tool for making that decision.
round(dim(subset(soy.o, lrr > 0))[1]/nrow(soy.o),2)
## [1] 0.6
There is a clear break which suggests that some of these trials are quite different from the rest. Which variable explains these differences? Almost all of the high values were from year 2014. In addition, we should also investigate how the observations group with respect to studies.
Almost all of the observations which have a large response are from a single study. It is not uncommon to throw this out and call it an ‘outlier’, but at the moment we have no reason for doing so. What if it truly represents a situation that can occur again? We should keep it as part of the decision-making process.
The variability in the confidence intervals is a reflection of both the underlying data and the sample size within each trial. Trials with a smaller sample size tend to have wider confidence intervals.
## Trials sig. > 0: 0.11
## Trials sig. < 0: 0.06
## Trials incl. 0: 0.83
Now only 11% of the trials had a significant positive response, about 6% had a significant negative response and the rest, 83%, had a response for which the confidence interval included zero. This is much less promising than the 60% we observed previously, but now we are counting the number of trials which had a positive response instead of the number of observations. It is somewhat ironic that the trial with the highest number of reps (the one with the highest precision) is the one that deviates the most from the response observed in the rest of the trials. This trial experienced hail, which is likely involved in the observed response. One possible explanation for the one trial which had a negative response could be higher moisture in the canopy (in narrow rows), which leads to higher disease pressure, particularly white mold.
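The percentages above come from classifying each trial’s confidence interval; since the code that produced them is not shown in this excerpt, here is a self-contained sketch of that classification using simulated data rather than the soybean set:

```r
## Sketch of the per-trial classification; data are simulated here
set.seed(42)
k <- 18; n <- 6
trial <- rep(1:k, each = n)
lrr <- rnorm(k * n, mean = rep(rnorm(k, 0.01, 0.06), each = n), sd = 0.06)
ci.one <- function(x, alpha = 0.05) {  # t-based CI for one trial
  se <- sd(x)/sqrt(length(x))
  mean(x) + c(-1, 1) * qt(1 - alpha/2, df = length(x) - 1) * se
}
cis <- t(sapply(split(lrr, trial), ci.one))
props <- c(pos  = mean(cis[, 1] > 0),                   # sig. > 0
           neg  = mean(cis[, 2] < 0),                   # sig. < 0
           zero = mean(cis[, 1] <= 0 & cis[, 2] >= 0))  # CI includes 0
round(props, 2)
```

Each trial falls in exactly one of the three categories, so the three proportions add to one.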
The disadvantage of the previous analysis is that we are not analyzing the data together. If we want to make a statement about the overall performance of the management practice, combining the trials is necessary. In the following steps we will gradually increase the complexity of the analysis. The first model (lm0) does not include trial; it analyzes the data ignoring its structure.
The concept of a prediction interval was introduced in Chapter 0. In this context, it represents a statement about a trial which is not part of the sample we collected. That is, we want an interval that will contain a future trial with 95% probability (or another alpha level). This is not trivial in this context. For example, Partlett et al. (2017) show that the prediction interval as defined by Higgins et al. (2009) does not perform well when the between-trial variance is equal to or less than the within-trial variance. Many variations on the interval using different methods did not have coverage close to the stated alpha level (e.g. 95%).
## Fitting a first model without trial, just the intercept
soy.lm0 <- lm(lrr ~ 1, data = soy.o)
## Intercept parameter confidence interval
pp.lm0 <- predict(soy.lm0,
newdata = data.frame(lrr = mean(soy.o$lrr)),
interval = "confidence")
## Prediction interval for an individual observation
ppp.lm0 <- predict(soy.lm0,
newdata = data.frame(lrr = mean(soy.o$lrr)),
interval = "prediction")
## The auxiliary function pred_int can also be used
## ppp.lm0 <- pred_int(soy.o$lrr)
In the following figure it is clear that the confidence interval is very narrow and does not represent the true variability in the dataset, while the prediction interval is very wide; although it describes the distribution of the data better, its usefulness is questionable.
A second step would be to include ‘trial’ in the model, but still in the context of a linear model. This model tests whether there is an effect of ‘trial’ (in this case it is significant); at least one trial has an effect which is different from the rest.
## Fitting a model with trial (fixed)
## Need to look into defining other contrasts
soy.lm1 <- lm(lrr ~ Trial_ID, data = soy.o)
pp.lm1 <- predict(soy.lm1,
newdata = data.frame(Trial_ID = unique(soy.o$Trial_ID)),
interval = "confidence")
pp.lm1d <- data.frame(trial = unique(soy.o$Trial_ID), pp.lm1)
pp.lm1d2 <- merge(soy.a3, pp.lm1d)
## prediction interval for the trial which is closest to the mean
pmt <- which.min(abs(pp.lm1d$fit - coef(soy.lm0)))
## The previous is potentially dangerous, because the actual mean might
## be far away from any trial in particular
ppp.lm1 <- predict(soy.lm1,
newdata = data.frame(Trial_ID = unique(soy.o$Trial_ID)[pmt]),
interval = "prediction")
ppp.lm1 <- as.data.frame(ppp.lm1)
The blue intervals in the following figure represent the fit from a linear model which includes trial as a fixed effect. Since we assume a common variance among the trials, the confidence intervals are narrower in some cases and a bit wider in others. The new line (“lm1 pred single trial”) is the prediction interval for a specific trial which was close to the estimate of the population mean (\(\hat{\mu}\)). It is much wider than the confidence interval for \(\hat{\mu}\), but narrower than the prediction interval which ignored the structure of the data (lm0).
The main difference now that we analyze the data jointly is that the confidence intervals for individual trials are narrower (blue dots and error bars). The reason is that we assume homogeneous variance and use the data from all the trials to estimate a single variance. (This is reasonable.) For some trials the confidence interval has become more precise (narrower), but for others it is slightly wider. Since the sample size for each trial is different, the intervals are of different widths; had the sample sizes been the same for all trials, the widths would have been exactly the same as well. The prediction interval for a single trial (close to the average) is shown, and it is narrower than the one for the intercept of the lm0 (no trial) model.
We have more or less exhausted what we can do using linear models for this example (not really). Now we move on to mixed models and consider ‘trials’ as ‘random’. This means that we are less focused on what happened in each individual trial and consider the trials as being ‘sampled’ from a larger population. (Also see the concept of exchangeability.) When considering factors as fixed we assume that we can replicate each level, but here each trial is an occurrence which we cannot easily replicate. For a distinction between ‘fixed’ and ‘random’ and other issues for mixed models see: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
Random effects models are a way of incorporating the effect of ‘trial’ but with an imposed distribution. Commonly it is assumed that they are normally distributed but other options are possible.
\[ y_{ij} = \mu + \theta_i + e_{ij} \] \(y_{ij}\) is the response variable (here, the log of the response ratio) for observation \(j\) in trial \(i\), \(\mu\) is the intercept (or population mean), \(\theta_i\) is the random effect of trial \(i\) (assumed to have zero mean and variance \(\tau^2\)), and \(e_{ij}\) is the residual, which accounts for the within-trial variability (assumed to have zero mean and variance \(\sigma^2\)). Although the distributions of \(\theta_i\) and \(e_{ij}\) are typically assumed to be normal (Gaussian), this assumption can be relaxed.
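A quick simulation from this model (with made-up values for \(\mu\), \(\tau\) and \(\sigma\); not the fitted estimates) shows how the two variance components enter the data:

```r
## Simulate: observation = mu + trial effect + within-trial residual
set.seed(7)
mu <- 0.014; tau <- 0.06; sigma <- 0.059  # assumed values
k <- 20; n <- 6                           # trials and observations per trial
theta <- rnorm(k, 0, tau)                 # trial effects, N(0, tau^2)
y <- rep(mu + theta, each = n) + rnorm(k * n, 0, sigma)
## The variance of the trial means should be roughly tau^2 + sigma^2/n
var(tapply(y, rep(1:k, each = n), mean))
```

This also shows why trial means are overdispersed relative to \(\tau^2\) alone: each mean carries its own sampling noise \(\sigma^2/n\).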
soy.lme <- lmer(lrr ~ 1 + (1|Trial_ID), data = soy.o)
round(exp(fixef(soy.lme)),3)
## (Intercept)
## 1.014
round(exp(confint(soy.lme, quiet = T))[3,],3)
## 2.5 % 97.5 %
## 0.983 1.046
## For these types of models there are different methods for
## computing confidence intervals ("profile","Wald","boot").
## see ?confint.merMod for more details
## Confidence intervals
ci.lme <- confint(soy.lme, quiet = T)[3,]
pp.lme <- c(fixef(soy.lme),ci.lme[1],ci.lme[2])
These results point to a 1.4% increase in yield when using the narrower row spacing (\(\hat{\mu}\)). The confidence interval provides additional support that this is a small increase. The point estimates for the standard deviations are \(\hat{\tau} =\) 0.0602765 and \(\hat{\sigma} =\) 0.058963. The intra-class correlation is \(0.06028^2/(0.06028^2 + 0.05896^2) = 0.51\). The intra-class correlation will be a very important quantity and concept later. Another quantity of interest is the heterogeneity ratio (\(\tau^2/\sigma^2\)), which in this case is \(0.06028^2/0.05896^2 = 1.045\).
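As a numerical check, the intra-class correlation follows directly from the two estimated standard deviations reported above:

```r
## Intra-class correlation from the REML standard deviations
tau <- 0.0602765; sigma <- 0.0589630
icc <- tau^2/(tau^2 + sigma^2)
round(icc, 2)  # 0.51: about half of the variability is between trials
```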
The confidence interval for the intercept of the model is wider than the one computed with the linear model (lm0), but its location is much closer to zero, to the point that it overlaps the zero line. It is wider because the trial variance is incorporated in this uncertainty. The effect is much smaller than what we had anticipated before. This analysis has the characteristic that it gives less weight to extreme observations.
Following Higgins et al. (2009) we can calculate the prediction interval. This is a known result applied to meta-analysis. For the general case, see Chapter 0.
\[ \hat{\mu} \pm t^{\alpha}_{k-2} \sqrt{\hat{\tau}^2 + SE(\hat{\mu})^2} \] \[ \hat{\mu} = \text{estimated mean} \] \[ \hat{\tau} = \text{estimated between studies std. deviation} \] \[ SE(\hat{\mu}) = \text{estimated standard error for the intercept} \]
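The formula translates directly into code. The inputs below are approximate values read off the `lmer` fit above (the standard error is backed out of the confidence interval), and `k = 18` trials is an assumption, since the trial count is not printed in this excerpt:

```r
## Higgins et al. (2009) prediction interval as a small function (sketch)
pred_int_higgins <- function(mu.hat, se.mu, tau.hat, k, alpha = 0.05) {
  half <- qt(1 - alpha/2, df = k - 2) * sqrt(tau.hat^2 + se.mu^2)
  c(fit = mu.hat, lwr = mu.hat - half, upr = mu.hat + half)
}
pi.h <- pred_int_higgins(mu.hat = 0.0139, se.mu = 0.0158,
                         tau.hat = 0.0603, k = 18)
round(exp(pi.h), 3)  # comparable to the 'lmer tdist' row of the table
```

Note the degrees of freedom are \(k - 2\) because two quantities (\(\hat{\mu}\) and \(\hat{\tau}^2\)) are estimated from the \(k\) trials.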
Here we compute the prediction interval using the Higgins formula with the helper function ‘pred_int’ (method = ‘tdist’); there is also an option to use the bootstrap method, which can be more robust in the presence of outliers.
## pred_int is a helper function which
## declutters things here; for details see chapter 0
p.soy.lme <- pred_int(soy.lme, method = "tdist")
## The bootstrap can be more robust to outliers
p.soy.lme.b <- pred_int(soy.lme, method = "boot")
p.soy.lme.d <- data.frame(funct = rep("lmer",2),
method = c("tdist","boot"),
fit = c(p.soy.lme[1],p.soy.lme.b[1]),
lwr = c(p.soy.lme[2],p.soy.lme.b[2]),
upr = c(p.soy.lme[3],p.soy.lme.b[3]),
row.names = NULL)
## The bootstrapped result is very similar to the
## simpler tdist method.
A fact that is somewhat unexpected is that the prediction interval does not simply overlap the distribution of the trial means. It is pulled toward zero and does not include the trial with the most extreme positive value.
There are many reasons for considering a Bayesian model for meta-analysis. An important one is that the between and within variances can be underestimated in the REML-based mixed model method (Gelman 2006). In some cases, including (weakly) informative priors for the variances can produce more realistic results. Other quantities of interest such as the confidence intervals for individual trials or prediction intervals are more easily derived.
Some references: Chung et al. “Avoiding zero between-study variance estimates in random-effects meta-analysis”. Statistics in Medicine. (2013). DOI: 10.1002/sim.5821
Gelman, Andrew: “Prior distributions for variance parameters in hierarchical models”. Bayesian Analysis. (2006).
In this first example we are using the package ‘MCMCglmm’. We provide weak (non-informative) priors.
prior1 <- list(B = list(mu = 0, V = 10),
G = list(G1 = list(V = 1, nu = 0.002)),
R = list(V = 1, nu = 0.002))
soy.mc <- MCMCglmm(lrr ~ 1, random = ~ Trial_ID,
prior = prior1, pr = TRUE, burnin = 5000,
nitt = 50000, thin = 2,
data = soy.o, verbose = FALSE, DIC = TRUE)
soy.mc.ci <- as.vector(summary(soy.mc)$solutions)[1:3]
## soy.mc.ci <- predict(soy.mc, interval = "confidence")[1,]
## Those two lines are equivalent
## With non-informative priors the results are almost identical
## to the lme solution
round(exp(rbind(pp.lme, soy.mc.ci)),3)
## (Intercept) 2.5 % 97.5 %
## pp.lme 1.014 0.983 1.046
## soy.mc.ci 1.014 0.982 1.049
## Calculating a predictive interval using MCMCglmm
## There are many different ways of doing this
## and they give similar results
soy.mc.pi0 <- pred_int(soy.mc, method = "tdist")
soy.mc.pi1 <- pred_int(soy.mc, method = "mcmc")
soy.mc.pi2 <- pred_int(soy.mc, method = "simulate")
soy.mc.pi3 <- pred_int(soy.mc, method = "predict")
## 'new_trial' method
soy.mc.pi4 <- pred_int_mcg(soy.o) ## "method = 'ntrial'"
## I pick one but they all give similar answers
soy.mc.pi <- soy.mc.pi3
soy.mc.col <- as.data.frame(rbind(soy.mc.pi0,
soy.mc.pi1,
soy.mc.pi2,
soy.mc.pi3,
soy.mc.pi4))
mmthd <- c("tdist","mcmc","simulate","predict","ntrial")
soy.mc.col.d <- data.frame(funct = "MCMCglmm",
method = mmthd, soy.mc.col,
row.names = NULL)
soy.mcpi <- rbind(p.soy.lme.d, soy.mc.col.d)
kable(cbind(soy.mcpi[,1:2],round(exp(soy.mcpi[,3:5]),3)),
caption = "Different methods for computing prediction intervals")
| funct | method | fit | lwr | upr |
|---|---|---|---|---|
| lmer | tdist | 1.014 | 0.889 | 1.157 |
| lmer | boot | 1.015 | 0.889 | 1.159 |
| MCMCglmm | tdist | 1.014 | 0.879 | 1.170 |
| MCMCglmm | mcmc | 1.013 | 0.887 | 1.159 |
| MCMCglmm | simulate | 1.014 | 0.914 | 1.124 |
| MCMCglmm | predict | 1.014 | 0.894 | 1.153 |
| MCMCglmm | ntrial | 1.013 | 0.886 | 1.160 |
These methods require an explanation. The ‘lmer-tdist’ method extracts the REML-based variance estimates and uses the standard prediction interval equation from Higgins et al. (2009). The ‘lmer-boot’ method computes the same statistic but uses the bootstrap (‘lme4::bootMer’); the idea is that the bootstrap can perform better in the presence of outliers, but it is computationally intensive and by default it runs 500 times. The ‘MCMCglmm-tdist’ method uses the same equation but extracts the variance estimates from the Markov chain Monte Carlo (MCMC) Bayesian model. The ‘MCMCglmm-mcmc’ method extracts the posterior distributions for \(\mu\) and \(\tau\), simulates normal deviations for the trial effects (using \(\tau_j\), where \(j\) is the \(j^{th}\) MCMC sample) and then simply adds these two posteriors. The ‘MCMCglmm-simulate’ method creates a posterior distribution for a new (out-of-sample) trial effect (\(\theta_{new}\)) and adds it to the posterior distribution of \(\mu\). The ‘MCMCglmm-predict’ method is the most computationally intensive: it simulates the distribution of all trial effects and then averages over all the samples (see ‘predict.MCMCglmm’ with options marginal = NULL and interval = ‘prediction’). These last two methods generate very similar results, but ‘simulate’ is much more efficient. The ‘MCMCglmm-ntrial’ method adds a new trial to the data.frame, runs the model, extracts the posterior distribution of \(\theta_{new}\) and adds it to the posterior distribution of \(\mu\). Conceptually it is the same as ‘simulate’, but in practice it does not seem to account for within-trial variability. The subtle differences among these last three methods tend to have important consequences in edge cases.
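The idea behind the ‘MCMCglmm-mcmc’ method can be sketched with fake posterior draws; the distributions below are illustrative stand-ins, not the fitted posteriors:

```r
## Sketch of the 'mcmc' method: combine the posterior for mu with a
## normal trial deviation drawn per MCMC sample (fake posteriors)
set.seed(11)
mu.post  <- rnorm(4000, 0.014, 0.016)              # pretend posterior for mu
tau.post <- sqrt(1/rgamma(4000, 50, rate = 0.18))  # pretend posterior for tau
pred.post <- mu.post + rnorm(4000, 0, tau.post)    # add a theta draw per sample
round(quantile(pred.post, c(0.025, 0.5, 0.975)), 3)
```

Because \(\tau_j\) varies across samples, the uncertainty about \(\tau\) itself is propagated into the interval, unlike the plug-in ‘tdist’ formula.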
The main thing to notice is the difference between the variance of a trial effect in the sample (‘theta_i’) and that of one outside our sample (‘theta_new’). The relative location of the two distributions is not important here, only their spread.
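The narrower spread of an in-sample trial effect follows from partial pooling: conditional on the data, a trial effect observed with \(n_i\) observations has approximate posterior variance \(1/(1/\tau^2 + n_i/\sigma^2)\), while a new trial’s effect keeps the full variance \(\tau^2\). A sketch using the REML estimates from above, with \(n_i = 6\) assumed:

```r
## Conditional spread of an in-sample trial effect vs. a new one
tau <- 0.0603; sigma <- 0.0590
n.i <- 6                                   # assumed reps in the trial
sd.in  <- sqrt(1/(1/tau^2 + n.i/sigma^2))  # approx. sd of 'theta_i' given data
sd.new <- tau                              # sd of 'theta_new'
c(sd.in = sd.in, sd.new = sd.new)
```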
There are many alternatives (and software options) for Bayesian analysis. One of these alternatives is based on the BUGS language (cite). In R, an implementation of this approach is through the connection with JAGS (Just Another Gibbs Sampler) and the ‘rjags’ package (other packages are available). In this case we define the model in terms of the assumptions about the probability distributions of each parameter in the model.
soyd <- list(lrr = soy.o$lrr, trial = soy.o$Trial_ID,
t.n = length(unique(soy.o$Trial_ID)),
n = nrow(soy.o))
init <- list(mu = 0, tau = 0.01, tau.trial = 0.01)
modelstring="
model {
# Single intercept model likelihood
for (i in 1:n) {
lrr[i]~dnorm(mu + b[trial[i]],tau)
}
# trial effect
for(j in 1:t.n){
b[j] ~ dnorm(0, tau.trial)
}
# priors
mu ~ dnorm(0,0.00001) # intercept prior
tau ~ dgamma(0.0001,0.0001) ## tau is the residual precision
sigma <- 1.0/sqrt(tau)
tau.trial ~ dgamma(0.0001, 0.0001)
sigma.trial <- 1.0/sqrt(tau.trial)
# generate predictions
beff ~ dnorm(0, tau.trial)
pred <- mu + beff
}
"
mdl=jags.model(textConnection(modelstring), data=soyd, inits=init,
n.chains = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 119
## Unobserved stochastic nodes: 22
## Total graph size: 289
##
## Initializing model
update(mdl, n.iter = 10000) ## burn-in; update() discards these samples
output=coda.samples(model=mdl,
variable.names=c("mu","pred","sigma","sigma.trial"),
n.iter=20000, thin=10)
gelman.diag(output) ## Should be close to 1 to assume convergence
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## mu 1 1
## pred 1 1
## sigma 1 1
## sigma.trial 1 1
##
## Multivariate psrf
##
## 1
print(summary(output))
##
## Iterations = 10010:30000
## Thinning interval = 10
## Number of chains = 2
## Sample size per chain = 2000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 0.03821 0.133880 2.117e-03 3.269e-02
## pred 0.03648 0.198481 3.138e-03 2.805e-02
## sigma 0.05945 0.004123 6.518e-05 6.763e-05
## sigma.trial 0.08673 0.132808 2.100e-03 2.945e-02
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu -0.01790 0.003473 0.01499 0.02615 0.57119
## pred -0.12771 -0.026282 0.01558 0.05923 0.31484
## sigma 0.05196 0.056621 0.05924 0.06208 0.06821
## sigma.trial 0.04332 0.054554 0.06206 0.07194 0.57030
plot(output, trace = FALSE)
plot(output, density = FALSE)
At this point I have introduced many methods for computing prediction intervals, and in this case there is strong agreement among the different approaches. This, however, does not prove that they are correct; it is possible for all of them to be incorrect, or in other words, they might not have the claimed alpha-coverage for a new observation (out-of-sample prediction).
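Coverage claims can be checked by simulation: generate data from known \(\mu\), \(\tau\) and \(\sigma\), build the Higgins-type interval, and see how often it contains a new trial’s mean. The sketch below uses assumed parameter values (not the book’s data) and a rough method-of-moments estimator in place of REML to keep it self-contained:

```r
## Sketch: simulation check of a Higgins-type prediction interval
set.seed(101)
mu <- 0.014; tau <- 0.06; sigma <- 0.059   # assumed 'true' values
k <- 15; n <- 6                            # trials, observations per trial
g <- rep(1:k, each = n)
cover <- replicate(500, {
  theta <- rnorm(k, 0, tau)
  y <- rep(mu + theta, each = n) + rnorm(k * n, 0, sigma)
  tm <- tapply(y, g, mean)                 # trial means
  s2.w <- mean(tapply(y, g, var))          # within-trial variance
  tau2.hat <- max(var(tm) - s2.w/n, 0)     # rough method-of-moments estimate
  se.mu <- sqrt(var(tm)/k)
  half <- qt(0.975, df = k - 2) * sqrt(tau2.hat + se.mu^2)
  theta.new <- rnorm(1, 0, tau)            # a 'new' trial
  abs(mu + theta.new - mean(tm)) < half
})
mean(cover)  # empirical coverage; compare with the nominal 0.95
```

Repeating this over a grid of \(\tau/\sigma\) values is one way to reproduce the kind of under-coverage reported by Partlett et al. (2017).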
| funct | method | fit | lwr | upr |
|---|---|---|---|---|
| lmer | tdist | 1.014 | 0.889 | 1.157 |
| lmer | boot | 1.015 | 0.889 | 1.159 |
| MCMCglmm | tdist | 1.014 | 0.879 | 1.170 |
| MCMCglmm | mcmc | 1.013 | 0.887 | 1.159 |
| MCMCglmm | simulate | 1.014 | 0.914 | 1.124 |
| MCMCglmm | predict | 1.014 | 0.894 | 1.153 |
| MCMCglmm | ntrial | 1.013 | 0.886 | 1.160 |
| jags | mcmc | 1.013 | 0.881 | 1.155 |
| method | sigma.trial | sigma |
|---|---|---|
| lm | NA | 0.1082348 |
| lmer | 0.0602765 | 0.0589630 |
| mcmc.glmm | 0.0624307 | 0.0593001 |
| jags | 0.0620579 | 0.0592414 |
There are many examples in which the observed between-trial variance is smaller than the within-trial variance. This is not, in general, what we expect with mixed models. If the sample size is large enough to estimate the variances with good precision, we often (not always) expect the ‘between-group’ variance to be larger than the ‘within-group’ variance.
One (somewhat extreme) example where this happens is the ‘Stratego YLD’ foliar fungicide in soybean.
sfs <- read.csv("./ch2/StrategoYldOnSoybean.csv")
sfs$lrr <- log(with(sfs, TRT_Yld/CTR_Yld))
## The lowest value is very different from the rest
summary(sfs$lrr)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.250709 -0.009848 0.024904 0.028104 0.063291 0.254443
## Plot by averaging
sfs.a <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = mean)
sfs.a$max <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = max)$lrr
sfs.a$min <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = min)$lrr
## Notice that the variance of trial means is very small
cat("Trial means variance: ",var(sfs.a$lrr),"\n")
## Trial means variance: 0.0009763225
## Reorder data for plotting
sfs.ao <- sfs.a[order(sfs.a$lrr),]
sfs.ao$trial <- 1:nrow(sfs.ao)
sfs.lm0 <- lm(lrr ~ 1, data = sfs)
sfs.lm <- lm(lrr ~ Trial_ID, data = sfs)
sfs.lmer <- lmer(lrr ~ 1 + (1 | Trial_ID), data = sfs)
## with this particular dataset the between trial variance is 0
The following options should also apply when the between-trial variance is not exactly zero but very close to it.
Option 0: we give up and decide these data are not worth analyzing.
Option 1: we decide that the ‘true’ model is a model without the trial effect. The trial effect ‘does not exist’; “there is no spoon”. In this case we can only compute two intervals: the ‘confidence interval’ and the ‘prediction interval’. Predicting a ‘new’ trial does not make sense because ‘there is no trial’. We can only make inferences about the model parameter \(\mu\) or an observation \(y\). The prediction interval is about \(y_{new}\), not about a new trial.
Option 2: we decide that the ‘true’ model is one with a trial effect, but we were not able to estimate it well with these data. We could try to fix some value for the between-trial variance, but this is arbitrary and not a reasonable option. Another approach is to use the bootstrap; perhaps by resampling these data we can obtain a better estimate of the between-trial variance. This option still results in a very small number for the variance (or standard deviation).
btv <- function (x) VarCorr(x)$Trial_ID[[1]]
boot.btv <- suppressWarnings(bootMer(sfs.lmer, btv, nsim = 500))
quantile(sqrt(boot.btv$t), c(0.025,0.5,0.975))
## 2.5% 50% 97.5%
## 0.000000e+00 6.439038e-09 1.941672e-02
Option 3: we decide that the ‘true’ model should have a trial effect, but as in option 2 we recognize that these data do not allow us to estimate it. The difference is that now we are using a Bayesian approach. However, this is more involved: it requires developing informative priors for this parameter, and providing informative priors for a variance is not intuitive. One option is to estimate the between-trial variance from other datasets. Also, even with weakly informative priors the estimated variance is likely to be small but not exactly zero.
| HRs | belief | action |
|---|---|---|
| HR = 1 | trial effect exists | just analyze the data |
| HR = 0.75 | trial effect likely to exist | we might need more data |
| HR = 0.5 | trial effect may exist | we need more data |
| HR = 0.25 | trial effect likely to not exist | are we going to drop the trial effect? |
| HR = 0 | trial effect does not exist | just drop the trial effect |
| HRs | belief | action |
|---|---|---|
| HR = 1 | trial effect exists (no need for priors) | just analyze the data |
| HR = 0.75 | trial effect exists (we might need priors) | are prediction intervals good? |
| HR = 0.5 | trial effect exists (consider better priors) | prediction intervals are not good |
| HR = 0.25 | trial effect exists (we need better priors) | we need more data |
| HR = 0 | trial effect exists (we really need better priors) | we need more data and priors |
set.seed(123456789)
prior1 <- list(R = list(V = 1, nu = 0.001), G = list(G1=list(V = 1, nu = 0.001)))
## These are 'non-informative' priors
sfs.mcmc1.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc2.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc3.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc.v2 <- mcmc.list(sfs.mcmc1.2$VCV, sfs.mcmc2.2$VCV, sfs.mcmc3.2$VCV)
sfs.mcmc.m2 <- mcmc.list(sfs.mcmc1.2$Sol, sfs.mcmc2.2$Sol, sfs.mcmc3.2$Sol)
gelman.diag(sfs.mcmc.v2)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## Trial_ID 1 1.01
## units 1 1.00
##
## Multivariate psrf
##
## 1
plot(sfs.mcmc.v2, density = FALSE)
set.seed(123456)
prior2 <- list(R = list(V = 0.0013978, nu = 1.492583),
G = list(G1=list(V = 0.0013978*diag(1), nu = 0.492583)))
## These are (weakly) informative priors
sfs.mcmc1.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc2.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc3.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc.v3 <- mcmc.list(sfs.mcmc1.3$VCV, sfs.mcmc2.3$VCV, sfs.mcmc3.3$VCV)
sfs.mcmc.m3 <- mcmc.list(sfs.mcmc1.3$Sol, sfs.mcmc2.3$Sol, sfs.mcmc3.3$Sol)
gelman.diag(sfs.mcmc.v3)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## Trial_ID 1 1.01
## units 1 1.01
##
## Multivariate psrf
##
## 1
plot(sfs.mcmc.v3, density = FALSE)
Based on these data alone it is difficult to decide which interval is the ‘correct’ one. If the goal is a prediction interval that is likely to contain a ‘new trial’, then maybe there are some we can discard. First, ‘lmer-tdist’ seems too narrow: it includes only a few of the observed trials, so if a future sample of trial means looks similar to the one we obtained, it is too narrow. The ‘lmer-boot’ method is wider than ‘lmer-tdist’ and is likely to have better coverage of future trial means, but it still does not seem wide enough. After these two we have two groups. The first includes ‘mcglm-tdist’, ‘mcglm-mcmc’ and ‘mcglm-ntrial’; the support for these intervals being correct is the belief that the observed trial means are overdispersed and that a future collection of data will fall within them. The second group includes ‘mcglm-simulate’, ‘mcglm-predict’, ‘lm-tdist’ and ‘lmer-tdist2’. The widths of these intervals are surprisingly similar, but the methods behind the calculations are very different. The simplest is ‘lm-tdist’, which comes from a model in which we removed the effect of trial (a fixed-effects model); the interpretation is that the interval will include future \(y_{new}\) observations, since trials ‘do not exist’. The methods ‘mcglm-simulate’ and ‘mcglm-predict’ use similar approaches, but they seem a bit too wide; they are more likely than the first group to contain at least 95% of future trial means if the future looks similar to the observed data. The ‘lmer-tdist2’ method is a new experimental equation. If we are forced to make a visual statement, it would seem that the ‘correct’ interval lies somewhere between the first group and the second: the first group is too narrow and the second is too wide. We need more evidence to support these visual impressions.
## Coverage for method 'mcglm-ntrial' 89.2 %
## Coverage for method 'mcglm-simulate' 100 %
This is the coverage of these intervals for the ‘current’ trials, but the intervals are aimed at including an observation from a trial which is not included here; a ‘theta_new’. It would make sense that the wider interval is more appropriate: since it includes 100% of the current trials, it is reasonable that it might include 95% of future trials, but there is no guarantee of this. It is less likely (though not impossible, depending on what the ‘true’ model is) that the interval which includes 89% of the current trial means will include 95% of future trial means.
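The coverage numbers above are simply the fraction of the observed trial means that fall inside a given interval. With made-up trial means and a made-up interval (both hypothetical, on the log scale):

```r
## Empirical coverage of one candidate interval over 'current' trial means
theta.hat <- c(-0.021, 0.001, 0.012, 0.019, 0.027, 0.031, 0.048, 0.195)
int <- c(lwr = -0.015, upr = 0.071)
cov.current <- mean(theta.hat >= int["lwr"] & theta.hat <= int["upr"])
cov.current  # 0.75: 6 of the 8 means fall inside this interval
```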
For completeness we also fit this model using JAGS.
sfsd <- list(lrr = sfs$lrr, trial = sfs$Trial_ID,
t.n = length(unique(sfs$Trial_ID)),
n = nrow(sfs))
init <- list(mu = 0, tau = 0.01, tau.trial = 0.01)
modelstring="
model {
# Single intercept model likelihood
for (i in 1:n) {
lrr[i]~dnorm(mu + b[trial[i]],tau)
}
# trial effect
for(j in 1:t.n){
b[j] ~ dnorm(0, tau.trial)
}
# priors
mu ~ dnorm(0,0.00001) # intercept prior
tau ~ dgamma(0.0001,0.0001) ## tau is the residual precision
sigma <- 1.0/sqrt(tau)
tau.trial ~ dgamma(0.0001, 0.0001)
sigma.trial <- 1.0/sqrt(tau.trial)
# generate predictions
beff ~ dnorm(0, tau.trial)
pred <- mu + beff
}
"
mdl=jags.model(textConnection(modelstring), data=sfsd, inits=init,
n.chains = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 200
## Unobserved stochastic nodes: 41
## Total graph size: 489
##
## Initializing model
update(mdl, n.iter = 10000) ## burn-in; update() discards these samples
output=coda.samples(model=mdl,
variable.names=c("mu","pred","sigma","sigma.trial"),
n.iter=20000, thin=10)
print(summary(output))
##
## Iterations = 10010:30000
## Thinning interval = 10
## Number of chains = 2
## Sample size per chain = 2000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 0.02782 0.005004 7.911e-05 7.912e-05
## pred 0.02748 0.015255 2.412e-04 2.453e-04
## sigma 0.06240 0.003282 5.190e-05 5.190e-05
## sigma.trial 0.01352 0.005027 7.948e-05 1.138e-04
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu 0.018137 0.024468 0.02778 0.03119 0.03777
## pred -0.004345 0.018758 0.02764 0.03685 0.05779
## sigma 0.056370 0.060140 0.06230 0.06454 0.06901
## sigma.trial 0.006081 0.009722 0.01276 0.01654 0.02523
plot(output, trace = FALSE)
plot(output, density = FALSE)
| funct | mu | sigma.trial | sigma |
|---|---|---|---|
| lm | 0.02810 | NA | 0.06241 |
| lmer | 0.02810 | 0.00000 | 0.06297 |
| mcmc.s.priors | 0.02771 | 0.01910 | 0.06205 |
| mcmc.p.priors | 0.02780 | 0.01725 | 0.06192 |
| rjags | 0.02782 | 0.01352 | 0.06240 |
| funct | mu | lwr | upr |
|---|---|---|---|
| lm | 0.0281 | 0.0193 | 0.0369 |
| lmer | 0.0281 | 0.0194 | 0.0368 |
| MCMCglmm | 0.0278 | 0.0180 | 0.0389 |
| jags | 0.0278 | 0.0181 | 0.0378 |
| funct | method | mu | lwr | upr |
|---|---|---|---|---|
| lm | tdist | 0.0281 | -0.0964 | 0.1526 |
| lmer | tdist | 0.0281 | 0.0191 | 0.0371 |
| lmer | boot | 0.0283 | 0.0116 | 0.0451 |
| MCMCglmm | tdist | 0.0278 | -0.0083 | 0.0640 |
| MCMCglmm | mcmc | 0.0285 | -0.0052 | 0.0654 |
| MCMCglmm | simulate | 0.0265 | -0.0917 | 0.1504 |
| MCMCglmm | predict | 0.0277 | -0.0973 | 0.1518 |
| MCMCglmm | ntrial | 0.0279 | -0.0153 | 0.0712 |
| jags | mcmc | 0.0276 | -0.0043 | 0.0578 |
The previous figure shows that there are many different ways of computing the prediction interval, but it does not point to which one is correct. The narrow ones seem too narrow and the wide ones seem too wide. Are the ones in the middle the correct ones?
This part is a work in progress. I’m working on simulations to understand this better.
soy.m1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = mean)[,2]
soy.m2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = mean)[,2]
soy.sd1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = sd)[,2]
soy.sd2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = sd)[,2]
soy.n1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = length)[,2]
soy.n2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = length)[,2]
rr.escl <- escalc("ROM", m1i = soy.m1i, m2i = soy.m2i,
sd1i = soy.sd1i, sd2i = soy.sd2i,
n1i = soy.n1i, n2i = soy.n2i)
soy.rma <- rma(yi, vi, data = rr.escl, test = "knha")
forest(soy.rma)
funnel(soy.rma)
## Comparing methods for the estimation of tau^2
## (note: "REML" appears twice in this list, so it is fitted twice)
mthds <- c("REML","DL","HE","HS","SJ","ML","REML","EB","PM")
tau.col <- numeric(length(mthds))
I2.col <- numeric(length(mthds))
for(i in 1:length(mthds)){
tmp <- rma(yi, vi, data = rr.escl, method = mthds[i])
tau.col[i] <- tmp$tau2
I2.col[i] <- tmp$I2
}
col.c <- data.frame(method = mthds, tau = tau.col, I2 = I2.col)
col.c
## method tau I2
## 1 REML 0.0025645157 91.76571
## 2 DL 0.0011192107 82.94572
## 3 HE 0.0027860419 92.37049
## 4 HS 0.0009599012 80.66264
## 5 SJ 0.0028436097 92.51338
## 6 ML 0.0023588147 91.11146
## 7 REML 0.0025645157 91.76571
## 8 EB 0.0027249670 92.21280
## 9 PM 0.0027320625 92.23146
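The \(I^2\) column relates \(\tau^2\) to a ‘typical’ within-trial sampling variance \(s^2\): \(I^2 = 100 \cdot \tau^2/(\tau^2 + s^2)\). A back-of-envelope check using the REML row above and an assumed value for \(s^2\) (which `rma` computes internally but does not print here):

```r
## I^2 from tau^2 and an assumed 'typical' sampling variance
tau2 <- 0.0025645   # REML estimate of tau^2 (from the table above)
s2   <- 0.00023     # assumed typical sampling variance (illustrative)
i2 <- 100 * tau2/(tau2 + s2)
round(i2, 1)  # 91.8, close to the reported REML I2
```

This makes clear why the \(\tau^2\) and \(I^2\) columns move together across estimators: \(s^2\) is the same for all methods.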
## Caterpillar plot
## Adapted from
## http://www.metafor-project.org/doku.php/plots:caterpillar_plot
### create plot
forest(rr.escl$yi, rr.escl$vi,
xlim=c(-0.2,0.3), ### adjust horizontal plot region limits
subset=order(rr.escl$yi), ### order by size of yi
slab=NA, annotate=FALSE, ### remove study labels and annotations
efac=0, ### remove vertical bars at end of CIs
pch=19, ### changing point symbol to filled circle
col="gray40", ### change color of points/CIs
psize=1, ### increase point size
cex.lab=1, cex.axis=1, ### increase size of x-axis title/labels
lty=c("solid","blank")) ### remove horizontal line at top of plot
### draw points one more time to make them easier to see
points(sort(rr.escl$yi), nrow(rr.escl):1, pch=19, cex=0.5)
### add summary polygon at bottom and text
addpoly(soy.rma, mlab="", annotate=FALSE, cex=1, row = 1, col = "red")
text(-0.1, 1, "RE Model", pos=4, offset=0, cex=1, col = "red")